Parametrical Neural Network

Abstract 

The storage capacity of the Hopfield model is about 15% of the network
size. It can be increased significantly only in the Potts-glass model of
associative memory, in which neurons can be in more than two different
states. We show that an even greater storage capacity can be achieved in
the parametrical neural network (PNN), which is based on the parametrical
four-wave mixing process well known in nonlinear optics. We present a
uniform formalism that describes both PNN and the Potts-glass associative
memory. To estimate the storage capacity we use the Chebyshev-Chernov
statistical technique.



Keywords: Associative Memory, Phase-Frequency Modulation, Optical Networks,
Chebyshev-Chernov Method.

1 INTRODUCTION

In refs. [1,2] a network based on the parametrical four-wave mixing (FWM)
process, which is well known in nonlinear optics, was examined. Such a
network is capable of holding and handling information encoded in the form
of phase-frequency modulation. In the network the signals propagate along
interconnections in the form of quasi-monochromatic pulses at q different
frequencies

$$\{\omega_l\}_1^q = \{\omega_1, \omega_2, \ldots, \omega_q\}. \qquad (1)$$

The model is based on a parametrical neuron, a cubic nonlinear element
capable of transforming and generating frequencies in parametrical FWM
processes $\omega_i - \omega_j + \omega_k \to \omega_r$. Schematically, this
model of a neuron can be regarded as a device composed of a summator of
input signals, a set of q ideal frequency filters $\{\omega_l\}_1^q$, a block
comparing the amplitudes of the signals, and q generators of
quasi-monochromatic signals $\{\omega_l\}_1^q$.

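Let $\{K^{(\mu)}\}_1^p$ be a set of patterns, each of which is a set of
quasi-monochromatic pulses with frequencies defined by Eq.(1) and amplitudes
equal to ±1:

$$K^{(\mu)} = (\kappa_1^{(\mu)}, \ldots, \kappa_N^{(\mu)}), \quad \text{where } \kappa_i^{(\mu)} = \pm\exp(\mathrm{i}\,\omega_{l_i^{(\mu)}} t), \quad \mu = 1,\ldots,p; \quad i = 1,\ldots,N; \quad 1 \le l_i^{(\mu)} \le q. \qquad (2)$$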

The memory of the network is localized in the interconnections $T_{ij}$,
i,j = 1,...,N, which accumulate the information about the states of the ith
and jth neurons in all the p patterns. We suppose that the interconnections
are dynamic and that they are organized according to the Hebb rule:

$$T_{ij} = (1 - \delta_{ij}) \sum_{\mu=1}^{p} \kappa_i^{(\mu)} \kappa_j^{(\mu)*}, \quad i,j = 1,\ldots,N. \qquad (3)$$

The network operates as follows. A quasi-monochromatic pulse with a
frequency $\omega_{l_j}$ that propagates along the (ij)-th interconnection
from the jth neuron to the ith one takes part in FWM processes with the
pulses stored in the interconnection,

$$\omega_{l_i^{(\mu)}} - \omega_{l_j^{(\mu)}} + \omega_{l_j} \to \omega_r.$$

The amplitudes ±1 have to be multiplied. Summing up the results of these
partial transformations over all patterns, μ = 1,...,p, we obtain a packet
of quasi-monochromatic pulses in which all the frequencies from the set (1)
are present. The amplitudes of the pulses are determined by the
interconnection. This packet is the result of the transformation of the
pulse by the interconnection $T_{ij}$, and it comes to the ith neuron. All
such packets are summed in this neuron. The summed signal propagates through
q parallel ideal frequency filters. The output signals from the filters are
compared with respect to their amplitudes. The signal with the maximal
amplitude activates the i-th neuron ('winner-take-all'). As a result, the
neuron generates an output signal whose frequency and phase are the same as
those of the activating signal.

Generally, when three pulses interact in an FWM process, a fourth pulse
always appears. The frequency of this pulse is defined by the conservation
laws only. However, for the above model to work as a memory, an important
condition must be added, which facilitates the propagation of the useful
signal and, at the same time, suppresses external noise. This condition is
the principle of incommensurability of frequencies proposed in [1,2]: no
combination $\omega_l - \omega_{l'} + \omega_{l''}$ can belong to the set
(1) when all the frequencies are different. This completes the description
of the operating principle of the network, which will be called the
parametrical neural network (PNN).
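The incommensurability condition can be checked by brute force for a small set of frequencies. The sketch below is only an illustration: the function name is hypothetical, "all the frequencies are different" is read as the three indices being pairwise distinct, and integer frequencies are used so that the arithmetic is exact.

```python
from itertools import permutations

def is_incommensurable(freqs):
    """Return True if no combination w_l - w_l' + w_l'' built from three
    pairwise different frequencies of the set falls back into the set.

    Assumes the frequencies in `freqs` are distinct integers.
    """
    fset = set(freqs)
    for w_l, w_lp, w_lpp in permutations(freqs, 3):
        if w_l - w_lp + w_lpp in fset:
            return False
    return True

# Powers of two satisfy the condition; an equidistant set does not,
# since, for example, 1 - 2 + 4 = 3 belongs to {1, 2, 3, 4}.
print(is_incommensurable([1, 2, 4, 8]))
print(is_incommensurable([1, 2, 3, 4]))
```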

There are several arguments in favor of PNN. First, frequency-phase
modulation is more convenient for optical processing of signals: it removes
the need to adapt an optical network artificially to amplitude-modulated
signals. Second, when signals with q different frequencies can propagate
along one interconnection (an analog of channel multiplexing), the number
of interconnections can be reduced by a factor of q^2. Note that
interconnections occupy nearly 98% of the area of neurochips. Third, the
signal-to-noise analysis made with the aid of the Chebyshev-Chernov
statistical method shows that the storage capacity of PNN is approximately
q^2 times that of the Hopfield model. Even for q ~ 10 the gain is two orders
of magnitude. For computer processing of color images the standard value is
q = 256; consequently, compared with the Hopfield model, the gain is about
five orders of magnitude. Along with the storage capacity, the noise
immunity of the network also increases. For example, we simulated PNN with
the parameters N = 100, q = 22 and p = 200. This network recognized any 80%
noisy pattern after 100 steps (in fact, in one pass over all the neurons).
The same network with the parameters N = 100, q = 25, p = 1000(!)
recognized a 65% noisy pattern in 4-5 passes over all the neurons. Recall
that some time ago the ability of the Hopfield model with N = 400, p = 30 to
recognize a 30% noisy pattern was presented as a high mark in pattern
recognition.

In the present work we investigate the abilities of PNN. An important
remark has to be made here. Generally speaking, there are different
parametrical FWM processes complying with the principle of
incommensurability of frequencies. For example, in [1,2] the parametrical
FWM process of the type

$$\omega_l - \omega_{l'} + \omega_{l''} \to \begin{cases} \omega_{l''}, & \text{when } l' = l; \\ \omega_l, & \text{when } l' = l''; \\ 0, & \text{in other cases} \end{cases}$$

was examined. The corresponding network will be called PNN-I. However,
better results can be obtained for the parametrical FWM process

$$\omega_l - \omega_{l'} + \omega_{l''} \to \begin{cases} \omega_l, & \text{when } l' = l''; \\ 0, & \text{in other cases.} \end{cases} \qquad (4)$$

This network will be called PNN-II.

The paper is organized as follows. In Section 2 we introduce a vector
formalism that allows us to formulate the problem in a general form; the
results for PNN-II are presented in the same section. In Section 3 the
vector formalism is used to examine the Potts-glass neural network, which we
compare with PNN-II. Some remarks are given in the Conclusions. The details
of the calculations are in the Appendix.



2 PNN-II 

In fact, PNN is an associative memory of the Hopfield type with neurons that
can be in more than two different states. Such models of neural networks
were examined previously (see, for example, the references up to [12]).
Usually such neurons are modeled by vectors rather than by scalar quantities
equal to ±1 or 0/1. The number of representative vectors is equal to the
number of different states of a neuron. In this Section we formulate the
vector formalism for PNN-II and then estimate the storage capacity of such a
network.

2.1 Vector Formalism



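In order to describe the q different states of neurons we use the set of
basis vectors $\mathbf{e}_l$ in the space $R^q$, q ≥ 1,

$$\mathbf{e}_l = (0, \ldots, 0, 1, 0, \ldots, 0)^{\top}, \quad l = 1,\ldots,q,$$

whose only nonzero component, equal to 1, is at the l-th position. The state
of the ith neuron is described by a vector $\mathbf{x}_i$,

$$\mathbf{x}_i = x_i\mathbf{e}_{l_i}, \quad x_i = \pm 1, \quad \mathbf{e}_{l_i} \in R^q, \quad 1 \le l_i \le q, \quad i = 1,\ldots,N.$$

The state of the network as a whole, X, is determined by a set of N
q-dimensional vectors $\mathbf{x}_i$: $X = (\mathbf{x}_1, \ldots, \mathbf{x}_N)$.
By analogy with Eq.(2) the p stored patterns are

$$X^{(\mu)} = (\mathbf{x}_1^{(\mu)}, \ldots, \mathbf{x}_N^{(\mu)}), \quad \mathbf{x}_i^{(\mu)} = x_i^{(\mu)}\mathbf{e}_{l_i^{(\mu)}}, \quad x_i^{(\mu)} = \pm 1, \quad 1 \le l_i^{(\mu)} \le q, \quad \mu = 1,\ldots,p.$$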

Since in this model neurons are vectors, the local field $\mathbf{h}_i$
affecting the ith neuron is a vector too. By analogy with the standard
Hopfield model we write

$$\mathbf{h}_i = \sum_{j=1}^{N} \mathbf{T}_{ij}\mathbf{x}_j. \qquad (5)$$

The (q x q)-matrix $\mathbf{T}_{ij}$ describes the interconnection between
the ith and the jth neurons. This matrix acts on the vector
$\mathbf{x}_j \in R^q$, converting it into a linear combination of the basis
vectors $\mathbf{e}_l$. This combination is an analog of the packet of
quasi-monochromatic pulses that come from the jth neuron to the ith one
after transformation in the interconnection (see Introduction). To satisfy
the conditions (3) and (4), we need to take the matrices as

$$\mathbf{T}_{ij} = (1 - \delta_{ij}) \sum_{\mu=1}^{p} \mathbf{x}_i^{(\mu)}\mathbf{x}_j^{(\mu)+}, \quad i,j = 1,\ldots,N, \qquad (6)$$

where $\delta_{ij}$ is the Kronecker symbol and $\mathbf{x}^{+}$ denotes the
row vector transposed to $\mathbf{x}$. The elements of these matrices are

$$T_{ij}^{(kl)} = (1 - \delta_{ij}) \sum_{\mu=1}^{p} (\mathbf{e}_k\mathbf{x}_i^{(\mu)})(\mathbf{x}_j^{(\mu)}\mathbf{e}_l), \quad k,l = 1,\ldots,q.$$

Let us define the dynamics of our q-dimensional neurons. Let
$X(t) = (\mathbf{x}_1(t), \ldots, \mathbf{x}_N(t))$ be the state of the
system at the time t. By definition, the ith neuron at the time t + 1 is
oriented along the direction closest to the local field $\mathbf{h}_i(t)$.
Let us clarify this definition. With the aid of (6) we write Eq.(5) in a
form more convenient for analysis:

$$\mathbf{h}_i(t) = \sum_{l=1}^{q} A_l^{(i)}\mathbf{e}_l, \quad \text{where } A_l^{(i)} = \sum_{j(\ne i)}^{N} \sum_{\mu=1}^{p} (\mathbf{e}_l\mathbf{x}_i^{(\mu)})(\mathbf{x}_j^{(\mu)}\mathbf{x}_j(t)). \qquad (7)$$

Let k be the index of the amplitude that is maximal in modulus in the
series (7):

$$|A_k^{(i)}| = \max_{1 \le l \le q} |A_l^{(i)}|.$$

Then according to our definition

$$\mathbf{x}_i(t+1) = \mathrm{sgn}(A_k^{(i)})\,\mathbf{e}_k. \qquad (8)$$

The expression (8) is identical to the 'winner-take-all' rule of the
Introduction.

The evolution of the system consists of successive changes of the
orientations of the vector-neurons according to the rule (8). We adopt the
convention that if several amplitudes are maximal in modulus simultaneously
and the neuron is in one of these unimprovable states, its state does not
change. Then it is easy to show that during the evolution of the network its
energy $H(t) = -\frac{1}{2}\sum_{i=1}^{N} (\mathbf{h}_i(t)\mathbf{x}_i(t))$
decreases. In the end the system reaches a local energy minimum. In this
state all the neurons $\mathbf{x}_i$ are oriented in an unimprovable manner,
and the evolution of the system comes to its end. These states are the fixed
points of the system. The necessary and sufficient condition for a
configuration X to be a fixed point is the fulfillment of the set of
inequalities

$$(\mathbf{x}_i\mathbf{h}_i) \ge |(\mathbf{e}_l\mathbf{h}_i)|, \quad \forall l = 1,\ldots,q; \quad \forall i = 1,\ldots,N. \qquad (9)$$
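The dynamics described in this subsection can be sketched in a few lines of code. This is a minimal illustration under our own assumptions, not the implementation used for the simulations reported in the paper: each neuron is stored as a pair (amplitude ±1, frequency index 0..q-1), all function names are hypothetical, the neurons are updated sequentially, and frequency noise replaces the index by a different one chosen uniformly.

```python
import random

def random_pattern(n, q, rng):
    """Random pattern: one (amplitude, frequency index) pair per neuron."""
    return [(rng.choice((-1, 1)), rng.randrange(q)) for _ in range(n)]

def amplitudes(patterns, state, i, q):
    """Amplitudes A_l of the local field acting on neuron i, l = 0..q-1."""
    amps = [0] * q
    for pat in patterns:
        # Sum over j != i of the scalar products (x_j^(mu) x_j); a term is
        # nonzero only where the two frequency indices coincide.
        overlap = sum(s_mu * s
                      for j, ((s_mu, l_mu), (s, l)) in enumerate(zip(pat, state))
                      if j != i and l_mu == l)
        s_i, l_i = pat[i]
        amps[l_i] += s_i * overlap  # (e_l x_i^(mu)) selects l = l_i^(mu)
    return amps

def update_neuron(patterns, state, i, q):
    """Winner-take-all update of neuron i in place; True if it changed."""
    amps = amplitudes(patterns, state, i, q)
    best = max(abs(a) for a in amps)
    s, l = state[i]
    if s * amps[l] == best:  # already unimprovable: keep the state
        return False
    k = max(range(q), key=lambda m: abs(amps[m]))
    state[i] = (1 if amps[k] > 0 else -1, k)
    return True

def recall(patterns, start, q, max_passes=20):
    """Sequential passes over all neurons until a fixed point is reached."""
    state = list(start)
    for _ in range(max_passes):
        changed = False
        for i in range(len(state)):
            changed = update_neuron(patterns, state, i, q) or changed
        if not changed:
            break
    return state

def distort(pattern, q, a, b, rng):
    """Flip the amplitude with probability a, change the frequency with b."""
    noisy = []
    for s, l in pattern:
        if rng.random() < a:
            s = -s
        if rng.random() < b:
            l = rng.choice([m for m in range(q) if m != l])
        noisy.append((s, l))
    return noisy

rng = random.Random(0)
N, q, p = 60, 8, 10
patterns = [random_pattern(N, q, rng) for _ in range(p)]
noisy = distort(patterns[0], q, a=0.0, b=0.3, rng=rng)
restored = recall(patterns, noisy, q)
print(restored == patterns[0])
```

The parameters N = 60, q = 8, p = 10 and 30% frequency noise are chosen only to keep the example fast and well inside the recognition region; they are not taken from the paper.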
2.2 Storage capacity of PNN-II

Let us estimate the storage capacity of the network in the limit N >> 1.
Suppose that the network starts from a distorted mth pattern

$$\tilde{X}^{(m)} = (a_1\hat{b}_1\mathbf{x}_1^{(m)}, a_2\hat{b}_2\mathbf{x}_2^{(m)}, \ldots, a_N\hat{b}_N\mathbf{x}_N^{(m)}).$$

Here $\{a_i\}_1^N$ and $\{\hat{b}_i\}_1^N$ define a phase noise and a
frequency noise, respectively: $a_i$ is a random value equal to -1 or +1
with the probabilities a and 1 - a, respectively; b is the probability that
the operator $\hat{b}_i$ changes the state of the vector
$\mathbf{x}_i^{(m)} = x_i^{(m)}\mathbf{e}_{l_i^{(m)}}$, and 1 - b is the
probability that this vector remains unchanged.

Let us examine to what extent the neural network recognizes the pattern
$X^{(m)}$ correctly. The amplitudes $A_l^{(i)}$ (7) have the form

$$A_l^{(i)} = \begin{cases} x_i^{(m)} \sum_{j=1}^{N-1} \xi_j + \sum_{r=1}^{L} \eta_r(l), & \text{when } l = l_i^{(m)}; \\ \sum_{r=1}^{L} \eta_r(l), & \text{when } l \ne l_i^{(m)}, \end{cases} \qquad (10)$$

where $\xi_j = a_j(\mathbf{x}_j^{(m)}\hat{b}_j\mathbf{x}_j^{(m)})$,
$\eta_r(l) \equiv \eta_j^{(\mu)}(l) = a_j(\mathbf{e}_l\mathbf{x}_i^{(\mu)})(\mathbf{x}_j^{(\mu)}\hat{b}_j\mathbf{x}_j^{(m)})$,
j = 1,...,N, j ≠ i, μ = 1,...,p, μ ≠ m. For simplicity, when writing the
quantities η, in place of the superscript μ and the subscript j we use the
subscript r, which takes L = (N - 1)(p - 1) different values r = 1,...,L.

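Let us note that when the patterns $\{X^{(\mu)}\}_1^p$ are uncorrelated, the
quantities $\xi_j$ and $\eta_r$ can be considered as independent random
variables described by the probability distributions

$$\xi_j = \begin{cases} +1, & \text{with probability } (1-a)(1-b) \\ 0, & \text{with probability } b \\ -1, & \text{with probability } a(1-b) \end{cases}, \qquad \eta_r(l) = \begin{cases} +1, & \text{with probability } 1/(2q^2) \\ 0, & \text{with probability } 1 - 1/q^2 \\ -1, & \text{with probability } 1/(2q^2) \end{cases}. \qquad (11)$$

Since the distributions of the quantities $\eta_r(l)$ are independent of l,
in what follows we simply write $\eta_r$. According to the rule (9), the ith
neuron finds itself in the state $\mathbf{x}_i^{(m)}$ when two conditions
for the amplitudes (10) are fulfilled:

$$\mathrm{sgn}(A_{l_i^{(m)}}^{(i)}) = x_i^{(m)}, \qquad \sum_{j=1}^{N-1} \xi_j + x_i^{(m)} \sum_{r=1}^{L} \eta_r > \Big| \sum_{r=1}^{L} \eta_r \Big|.$$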

Otherwise there will be an error in the recognition of the vector
$\mathbf{x}_i^{(m)}$. Since the random variable $x_i^{(m)}\eta_r$ has the
same distribution as $\eta_r$, the probability of this error is

$$\mathrm{Pr}_i = \mathrm{Pr}\Big\{ \sum_{j=1}^{N-1} \xi_j + \sum_{r=1}^{L} \eta_r < 0 \Big\}. \qquad (12)$$

To estimate the value of $\mathrm{Pr}_i$ we use the well-known
Chebyshev-Chernov method (see Appendix). As a result we obtain the
expression for the probability of the error in the recognition of the
pattern $X^{(m)}$:

$$\mathrm{Pr}_{err} = N \exp\Big( -\frac{N(1-2a)^2}{2p}\, q^2 (1-b)^2 \Big). \qquad (13)$$

When N increases, this probability tends to zero if p as a function of N
increases slower than

$$p_c = \frac{N(1-2a)^2}{2\ln N}\, q^2 (1-b)^2. \qquad (14)$$

This allows us to use (14) as an asymptotically possible value of the
storage capacity of PNN-II.

When q = 1, Eqs.(13)-(14) reduce to the well-known results for the standard
Hopfield model (in this case there is no frequency noise, b = 0). When q
increases, the probability of the error (13) decreases exponentially, i.e.
the noise immunity of PNN increases noticeably. At the same time the storage
capacity of the network increases proportionally to q^2. In contrast to the
Hopfield model, the number of patterns p can be much greater than the number
of neurons.
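The estimates (13) and (14) are easy to evaluate numerically. The sketch below assumes the reading Pr_err = N exp(-N (1-2a)^2 q^2 (1-b)^2 / (2p)) for Eq.(13) and p_c = N (1-2a)^2 q^2 (1-b)^2 / (2 ln N) for Eq.(14); the function names are ours.

```python
import math

def pnn2_error_bound(n, p, q, a=0.0, b=0.0):
    """Estimate (13) of the recognition error for N = n neurons, p patterns,
    q frequencies, phase noise a and frequency noise b (assumed form)."""
    return n * math.exp(-n * (1 - 2 * a) ** 2 * q ** 2 * (1 - b) ** 2 / (2 * p))

def pnn2_capacity(n, q, a=0.0, b=0.0):
    """Asymptotic storage capacity (14) (assumed form)."""
    return n * (1 - 2 * a) ** 2 * q ** 2 * (1 - b) ** 2 / (2 * math.log(n))

# With q = 1 and b = 0 the expressions reduce to the Hopfield estimates,
# so the ratio of capacities at equal noise is q**2.
gain = pnn2_capacity(100, 22) / pnn2_capacity(100, 1)
print(gain)
print(pnn2_error_bound(100, 200, 22, a=0.0, b=0.8))
```

Note that (13) is an upper estimate: where it exceeds 1 it carries no information about the actual error rate.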

For example, let us set a constant value Pr_err = 0.01. In the Hopfield
model, with this probability of error we can recognize any of p = N/10
patterns, each of which is less than 30% noisy. At the same time, PNN-II
with q = 64 allows us to recognize any of p = 5N patterns with 90% noise,
or any of p = 50N patterns with 65% noise.

In Fig.1 we give an example of the restoration of a 90% distorted pattern
(a = 0, b = 0.9). Here the parameters of the network are N = 100, p = 200,
q = 32. The pattern is a picture of a dog. The gray squares are noisy pixels.
The states of the network after 50 and 100 steps are shown.

In Fig.2, for different values of q, we show the probability of pattern
recognition P_rec = 1 - Pr_err as a function of the frequency noise
b · 100%, b ∈ [0, 1], when α = p/N = 2 (solid line); the phase noise is
equal to zero, a = 0. We see that if q = 20, any pattern is recognized
correctly when the noise is less than 70%, and if q = 30, when the noise is
less than 85%. Generally, if the noise is less than a critical value b_c,

$$b_c = 1 - \frac{1}{q}\sqrt{2\alpha\ln N}, \qquad (15)$$

PNN recognizes a noisy pattern for sure, and if b > b_c, the probability of
recognition tends to zero. Our computer simulations confirm these results.



3 Potts-glass neural networks 

The models of associative memory with neurons that can be in more than two
different states have been investigated by many authors (see the references
up to [12]). All these models are related to the Potts model of a magnet.
The latter generalizes the Ising model to the case of a spin variable that
takes q > 2 different values. In all these works the authors used the same
well-known approach relating the Ising model to the Hopfield model (see, for
example, [15]). Namely, in place of the short-range interaction between two
nearest spins, Hebb-type interconnections between all vector-neurons were
used. As a result, long-range interactions appeared. Then in the mean-field
approximation it was possible to calculate the statistical sum and,
consequently, to construct the phase diagram. Different regions of the phase
diagram were interpreted in terms of the ability of the network to recognize
noisy patterns.

Among all the models of q-state associative memory, the characteristics of
the anisotropic Potts-glass neural network (APGNN) [11] are closest to those
of PNN-II. In the other models the storage capacity is even less than that
of the Hopfield model. Below we describe APGNN in terms of our vector
formalism and compare it with PNN-II.

APGNN consists of N neurons, each of which can be in q different states.
Now, to describe the states of the neurons, in place of the basis vectors
$\mathbf{e}_l \in R^q$ (see Subsection 2.1), q-dimensional vectors of a
special type are used. Namely, the lth state of a neuron is described by a
column-vector $\mathbf{d}_l \in R^q$,

$$\mathbf{d}_l = \frac{1}{q}(-1, \ldots, -1, q-1, -1, \ldots, -1)^{\top}, \quad l = 1,\ldots,q,$$

where the component q - 1 is at the l-th position. The state of the i-th
neuron is described by a vector $\mathbf{x}_i = \mathbf{d}_{l_i}$,
1 ≤ l_i ≤ q, i = 1,...,N. The state of the network as a whole, X, is
determined by a set of N q-dimensional vectors $\mathbf{x}_i$:
$X = (\mathbf{x}_1, \ldots, \mathbf{x}_N)$. The p stored patterns are

$$X^{(\mu)} = (\mathbf{x}_1^{(\mu)}, \ldots, \mathbf{x}_N^{(\mu)}), \quad \mathbf{x}_i^{(\mu)} = \mathbf{d}_{l_i^{(\mu)}}, \quad 1 \le l_i^{(\mu)} \le q, \quad \mu = 1,2,\ldots,p.$$

The local field $\mathbf{h}_i$ affecting the i-th neuron is the vector
$\mathbf{h}_i = \sum_{j=1}^{N} \mathbf{T}_{ij}\mathbf{x}_j$, where the
(q x q)-matrices $\mathbf{T}_{ij}$ describe the interconnections between the
i-th and the j-th neurons. These matrices are

$$\mathbf{T}_{ij} = (1 - \delta_{ij}) \sum_{\mu=1}^{p} \mathbf{x}_i^{(\mu)}\mathbf{x}_j^{(\mu)+}, \quad i,j = 1,\ldots,N.$$

As in Subsection 2.1, the dynamics of APGNN is defined as follows: the i-th
neuron at the next time step t + 1 is oriented along the direction closest
to the local field $\mathbf{h}_i(t)$ at the time t. During the evolution of
the network the energy
$H(t) = -\frac{1}{2}\sum_{i=1}^{N} (\mathbf{h}_i(t)\mathbf{x}_i(t))$
decreases. The necessary and sufficient condition for a configuration X to
be a fixed point is the fulfillment of the set of inequalities

$$(\mathbf{x}_i\mathbf{h}_i) \ge (\mathbf{d}_l\mathbf{h}_i), \quad \forall l = 1,\ldots,q; \quad \forall i = 1,\ldots,N.$$

We see that PNN-II and APGNN are much alike. The differences between these
models are that, first, in APGNN the vectors d_l are nonorthogonal and,
second, in APGNN there are no amplitudes ±1 associated with the vectors d_l.
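The nonorthogonality of the APGNN vectors can be verified directly. The sketch below assumes the normalization d_l = (1/q)(-1, ..., q-1, ..., -1), which reproduces the scalar products (q-1)/q and -1/q that appear later in the distribution of the signal terms; the helper names are ours, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def potts_vector(l, q):
    """Vector d_l: component (q-1)/q at position l and -1/q elsewhere."""
    return [Fraction(q - 1, q) if k == l else Fraction(-1, q) for k in range(q)]

def dot(u, v):
    """Scalar product of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))

q = 5
d = [potts_vector(l, q) for l in range(q)]
print(dot(d[0], d[0]))  # equal states: (q-1)/q
print(dot(d[0], d[1]))  # different states: -1/q, so the vectors are not orthogonal
```

In contrast, the basis vectors of PNN-II give scalar products 1 and 0, and the sign of a state is carried by the separate amplitude ±1.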

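When q = 2, APGNN is the same as the standard Hopfield model. Repeating the
argumentation of Subsection 2.2, we can estimate the storage capacity of
APGNN for N >> 1. We must only take into account that there is no phase
noise $\{a_i\}_1^N$ in this model. The distorted mth pattern has the form

$$\tilde{X}^{(m)} = (\hat{b}_1\mathbf{x}_1^{(m)}, \hat{b}_2\mathbf{x}_2^{(m)}, \ldots, \hat{b}_N\mathbf{x}_N^{(m)}).$$

As above, the random operator $\hat{b}_i$ changes the state of the vector
$\mathbf{x}_i^{(m)}$ with the probability b, and with the probability 1 - b
this vector remains unchanged. Now the probability of the error in the
recognition of the vector $\mathbf{x}_i^{(m)}$ is

$$\mathrm{Pr}_i = \mathrm{Pr}\Big\{ \sum_{j=1}^{N-1} \xi_j + \sum_{r=1}^{L} \eta_r(l_i^{(m)}) < \sum_{r=1}^{L} \eta_r(l) \Big\},$$

where $\xi_j = (\mathbf{x}_j^{(m)}\hat{b}_j\mathbf{x}_j^{(m)})$ and
$\eta_r(l) = (\mathbf{d}_l\mathbf{x}_i^{(\mu)})(\mathbf{x}_j^{(\mu)}\hat{b}_j\mathbf{x}_j^{(m)})$.
The independent random variables $\xi_j$ and the differences
$\eta_r(l_i^{(m)}) - \eta_r(l)$ are distributed as

$$\xi_j = \begin{cases} (q-1)/q, & \text{with probability } 1-b \\ -1/q, & \text{with probability } b \end{cases}, \qquad \eta_r(l_i^{(m)}) - \eta_r(l) = \begin{cases} (q-1)/q, & \text{with probability } 1/q^2 \\ 1/q, & \text{with probability } (q-1)/q^2 \\ 0, & \text{with probability } (q-2)/q \\ -1/q, & \text{with probability } (q-1)/q^2 \\ -(q-1)/q, & \text{with probability } 1/q^2 \end{cases}.$$

Naturally, this is true for randomized patterns $\{X^{(\mu)}\}_1^p$ only.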

Similarly to the calculations of the Appendix, we obtain the expression for
the probability of the error in the recognition of the pattern $X^{(m)}$:

$$\mathrm{Pr}_{err} = N \exp\Big( -\frac{N}{2p}\, \frac{q(q-1)}{2}\, (1-\tilde{b})^2 \Big), \qquad \tilde{b} = \frac{q}{q-1}\, b. \qquad (16)$$

Then the asymptotically possible value of the storage capacity of APGNN is

$$p_c = \frac{N}{2\ln N}\, \frac{q(q-1)}{2}\, (1-\tilde{b})^2. \qquad (17)$$

When q = 2, these expressions give the known estimates for the Hopfield
model. For q > 2 the storage capacity of APGNN is q(q - 1)/2 times as large
as the storage capacity of the Hopfield model. Previously the same factor
was obtained by fitting the results of numerical calculations. Our approach
allows us to obtain the same result rigorously.
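The two capacity estimates can be compared numerically. The sketch assumes the form p_c = N/(2 ln N) · q(q-1)/2 · (1 - q b/(q-1))^2 for APGNN and p_c = N (1-2a)^2 q^2 (1-b)^2 / (2 ln N) for PNN-II; the function names are ours.

```python
import math

def apgnn_capacity(n, q, b=0.0):
    """Assumed form of the APGNN capacity estimate; requires q >= 2."""
    b_tilde = q * b / (q - 1)  # rescaled frequency noise
    return n / (2 * math.log(n)) * q * (q - 1) / 2 * (1 - b_tilde) ** 2

def pnn2_capacity(n, q, a=0.0, b=0.0):
    """Assumed form of the PNN-II capacity estimate."""
    return n * (1 - 2 * a) ** 2 * q ** 2 * (1 - b) ** 2 / (2 * math.log(n))

# Without noise the ratio PNN-II / APGNN equals 2q/(q-1): it is 4 for q = 2
# and approaches 2 as q grows.
for q in (2, 11, 101):
    print(q, pnn2_capacity(100, q) / apgnn_capacity(100, q))
```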



4 Conclusions 

For q >> 1 the storage capacity of APGNN is half that of PNN-II (compare
Eq.(17) with Eq.(14) for a = 0). In the calculation of the probability of
the recognition error, an additional factor of two appears in the exponent
(see Eqs.(13),(16)). This leads to a significantly lower noise immunity of
APGNN compared with PNN-II. This is clearly seen in Fig.2, where for APGNN
the dashed line shows the dependence of the pattern recognition probability
P_rec on the frequency noise b under the same conditions as for PNN-II (the
solid line). The superiority of PNN-II is easily seen, especially in the
region of moderate values q ~ 10. For APGNN the critical value of the noise
b_c (15) is less than the analogous characteristic of PNN-II.

In conclusion, we note that our approach allows us to describe not only
optical neural networks of the parametrical type, but also neural networks
in which information is encoded in the form of phase delays of pulses in
interconnections. Such a network is much easier to realize as a device.



5 Appendix 

The following equality holds for the probability Pr_i (12):

$$\mathrm{Pr}_i = \mathrm{Pr}\Big\{ \sum_{j=1}^{N-1} \xi_j + \sum_{r=1}^{L} \eta_r < 0 \Big\} = \mathrm{Pr}\Big\{ -\sum_{j=1}^{N-1} \xi_j - \sum_{r=1}^{L} \eta_r > 0 \Big\}.$$

Using the known approach of exponential estimates of the Chebyshev type, for
any z > 0 we obtain:

$$\mathrm{Pr}_i \le \overline{\exp\Big[ z\Big( -\sum_{j=1}^{N-1} \xi_j - \sum_{r=1}^{L} \eta_r \Big) \Big]} = \Big( \overline{\exp(-z\xi_j)}\, \big( \overline{\exp(-z\eta_r)} \big)^{p-1} \Big)^{N-1}.$$

The overline means averaging over all possible realizations, and the last
equality follows from the independence of the random variables $\xi_j$ and
$\eta_r$.

Taking into account the distributions (11), it is easy to obtain the averages

$$\overline{\exp(-z\xi_j)} = (1-a)(1-b)e^{-z} + b + a(1-b)e^{z}, \qquad \overline{\exp(-z\eta_r)} = \frac{e^{-z}}{2q^2} + 1 - \frac{1}{q^2} + \frac{e^{z}}{2q^2}.$$

Changing the variable $e^z = x$ and introducing the functions $f_1(x)$ and
$f_2(x)$,

$$f_1(x) = a(1-b)x + b + \frac{(1-a)(1-b)}{x}, \qquad f_2(x) = \frac{1}{2q^2}\Big( x + \frac{1}{x} \Big) + 1 - \frac{1}{q^2},$$

we obtain that for any x > 1 the following estimate is valid:

$$\mathrm{Pr}_i \le \big( f_1(x) f_2^{p-1}(x) \big)^{N-1}. \qquad (18)$$

To obtain the minimal possible value of the probability Pr_i, we need to
find the value of the variable x minimizing the right-hand side of Eq.(18).
This leads us to the equation

$$(p-1)(x^2-1) + \frac{a(1-b)x^2 - (1-a)(1-b)}{a(1-b)x^2 + bx + (1-a)(1-b)}\, \big( x^2 + 2(q^2-1)x + 1 \big) = 0.$$

When p >> 1, the proper root of this equation, up to terms of the order of
1/p, is equal to $x_1 = 1 + q^2(1-2a)(1-b)/(p-1)$. Substituting this value
of x in Eq.(18), we obtain

$$\mathrm{Pr}_i \le \Big( 1 - \frac{q^2(1-2a)^2(1-b)^2}{2p} \Big)^{N-1} \approx \exp\Big( -\frac{N(1-2a)^2}{2p}\, q^2(1-b)^2 \Big).$$

This inequality gives the estimate (13) for the probability of the error in
the recognition of the pattern $X^{(m)}$.
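The asymptotic root can be checked numerically. The sketch below assumes f_1(x) = a(1-b)x + b + (1-a)(1-b)/x and f_2(x) = (x + 1/x)/(2q^2) + 1 - 1/q^2, minimizes the logarithm of f_1(x) f_2^{p-1}(x) by a ternary search on [1, 2], and compares the result with x_1 = 1 + q^2(1-2a)(1-b)/(p-1). The parameter values and function names are ours and serve only as an illustration.

```python
import math

def f1(x, a, b):
    return a * (1 - b) * x + b + (1 - a) * (1 - b) / x

def f2(x, q):
    return (x + 1 / x) / (2 * q * q) + 1 - 1 / (q * q)

def log_bound(x, a, b, q, p):
    """Logarithm of f1(x) * f2(x)**(p-1), the per-neuron factor of the bound."""
    return math.log(f1(x, a, b)) + (p - 1) * math.log(f2(x, q))

def minimize(a, b, q, p, lo=1.0, hi=2.0, iters=200):
    """Ternary search; log_bound is unimodal in x because it is convex in ln x."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_bound(m1, a, b, q, p) < log_bound(m2, a, b, q, p):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

a, b, q, p = 0.1, 0.3, 4, 2000
x_num = minimize(a, b, q, p)
x_1 = 1 + q ** 2 * (1 - 2 * a) * (1 - b) / (p - 1)
g_num = log_bound(x_num, a, b, q, p)
g_approx = -q ** 2 * (1 - 2 * a) ** 2 * (1 - b) ** 2 / (2 * p)
print(x_num, x_1)       # numerical minimizer and the asymptotic root
print(g_num, g_approx)  # exponent per neuron: numerical and asymptotic
```

For p >> 1 the two pairs of numbers should agree up to corrections of relative order 1/p.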



List of Figures 

Fig.1. The restoration of the pattern with 90% frequency noise (b = 0.9),
when N = 100, p = 200 and q = 32. The pattern is a picture of a dog. The
gray squares are noisy pixels. The states of the network after 50 and 100
steps are shown.

Fig.2. The probability of pattern recognition P_rec = 1 - Pr_err versus the
frequency noise b · 100%, b ∈ [0, 1], for different values of q and
α = p/N = 2, for PNN-II (solid line) and for the Potts-glass neural network
(dashed line).